Report: RNA-velocity estimation and projection on UMAP - pseudotime - part 2

Jingtao Lilue and António Sousa - UBI-IGC (e-mail: agsousa@igc.gulbenkian.pt)

01/06/21





Request


This report describes the second part of the pseudotime analysis, which consisted of estimating RNA velocity and projecting the trajectories onto the UMAP embeddings obtained previously with Seurat:


  1. Pseudotime of the integrated scRNA-seq samples Cre3 and Lox2 of Plasmodium chabaudi with RNA-velocity and projection in the UMAP obtained with Seurat (part 2).


RNA-velocity includes two steps:

  1. Generating the .loom file with the spliced and unspliced count matrices (done previously in the terminal):

    • This step was done with the velocyto CLI tool by running the bash script: ./velocyto_script.sh &> velocyto_log.log. The script and its logs can be found in the folder scripts. The inputs were the sample folders generated by the cellranger pipeline, each containing the folder outs with the .bam alignment file and the filtered barcodes, together with the GTF file of Plasmodium chabaudi.
  2. Estimating RNA-velocity from the spliced and unspliced matrices (described herein):

    • This step uses the ratio of spliced and unspliced RNA transcripts to estimate the RNA velocities and thereby determine the cell trajectories. It is the subject of this report and includes the projection of the trajectories onto the UMAP embeddings obtained previously with Seurat.
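To make the idea behind step 2 concrete, the following is a minimal sketch of the simplest steady-state formulation of RNA velocity (La Manno et al., 2018), in which the velocity of a gene in a cell is the unspliced count minus gamma times the spliced count. It is an illustration only: the report does not state which scvelo model was used, the function names are hypothetical, the counts are made up, and gamma is fitted here on all cells by a plain least-squares slope through the origin rather than on extreme quantiles.

```python
def fit_gamma(spliced, unspliced):
    """Least-squares slope through the origin of unspliced vs. spliced counts."""
    numerator = sum(s * u for s, u in zip(spliced, unspliced))
    denominator = sum(s * s for s in spliced)
    if denominator == 0:
        raise ValueError("spliced counts are all zero; gamma is undefined")
    return numerator / denominator


def steady_state_velocity(spliced, unspliced):
    """Return gamma and the per-cell velocity u - gamma * s for one gene."""
    gamma = fit_gamma(spliced, unspliced)
    velocity = [u - gamma * s for s, u in zip(spliced, unspliced)]
    return gamma, velocity


# Toy counts for one gene across five cells (made-up numbers).
spliced = [1.0, 2.0, 3.0, 4.0, 5.0]
unspliced = [0.5, 1.0, 2.5, 2.0, 2.5]

gamma, velocity = steady_state_velocity(spliced, unspliced)
for s, u, v in zip(spliced, unspliced, velocity):
    state = "induction" if v > 0 else "repression"
    print(f"s={s:.1f} u={u:.1f} velocity={v:+.3f} ({state})")
```

A positive velocity means more unspliced transcript than the steady state predicts (the gene is being up-regulated in that cell), and a negative velocity means the opposite.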


This analysis is based on the following tutorial: https://github.com/basilkhuder/Seurat-to-RNA-Velocity.






Import packages & Seurat data



The cell and cluster ids as well as the UMAP embeddings of each sample were imported, together with the loom files generated with the velocyto CLI (v.0.17.17; La Manno et al., 2018). velocyto was run with GNU parallel (v.20161222; Tange, 2011) to process the samples in parallel.






Parse loom based on Seurat


Cells that do not appear in the processed Seurat objects of the samples Cre3 and Lox2 were filtered out of the loom files.

Initially, the loom files of spliced and unspliced count matrices contained 15543 and 5204 cells for the samples Cre3 and Lox2, respectively.

After filtering, 15537 and 5176 cells remained for Cre3 and Lox2, respectively (6 and 28 cells removed).
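The filtering step can be sketched as below. This is a hedged illustration rather than the code used for the report: it assumes that loom cell ids look like "sample:BARCODEx" and that Seurat cell ids look like "BARCODE-1", which may not match the actual ids in these samples. The function names and the toy barcodes are hypothetical.

```python
import re


def normalise_barcode(cell_id):
    """Reduce a cell id to its bare barcode.

    Assumes an optional 'sample:' prefix and an optional trailing 'x'
    or '-<number>' suffix; both are assumptions about the id format.
    """
    barcode = cell_id.split(":")[-1]
    return re.sub(r"(x|-\d+)$", "", barcode)


def filter_loom_cells(loom_ids, seurat_ids):
    """Keep only the loom cell ids whose barcode is present in Seurat."""
    keep = {normalise_barcode(cell_id) for cell_id in seurat_ids}
    return [cell_id for cell_id in loom_ids if normalise_barcode(cell_id) in keep]


# Toy ids (made up): three cells in the loom file, two retained by Seurat.
loom_ids = ["Cre3:AAACCCAAGx", "Cre3:AAACGGTTCx", "Cre3:TTTGGAACAx"]
seurat_ids = ["AAACCCAAG-1", "TTTGGAACA-1"]

kept = filter_loom_cells(loom_ids, seurat_ids)
print(f"{len(loom_ids)} cells before filtering, {len(kept)} after")
```

Matching on the bare barcode rather than on the full id avoids silently dropping every cell when the two tools decorate the barcodes differently.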






Estimating RNA-velocity, projecting trajectories & pseudotime


RNA velocities and pseudotime were estimated with the Python package scvelo (v.0.2.2; Bergen et al., 2020). Other packages used were os, random, anndata (v.0.7.5), pandas (v.1.1.3), numpy (v.1.19.2) and matplotlib (v.3.3.2), all running under Python (v.3.8.3).


Estimating RNA-velocity & projecting trajectories through UMAP - Cre3


Spliced/unspliced fractions - Cre3


The overall percentages of spliced and unspliced counts, as well as the percentages broken down by cluster, are presented below.
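As an illustration of how such percentages can be obtained, the sketch below computes the unspliced fraction of each cell (unspliced counts divided by spliced plus unspliced counts) and averages it overall and per cluster. Averaging per-cell fractions is one plausible convention and is an assumption here, as are the function names, the cluster labels and the toy counts; the figures in this report come from scvelo, not from this code.

```python
from collections import defaultdict
from statistics import mean


def unspliced_fraction(spliced, unspliced):
    """Unspliced counts over spliced plus unspliced counts (0.0 for empty cells)."""
    total = spliced + unspliced
    return unspliced / total if total > 0 else 0.0


def mean_fraction_by_cluster(cells):
    """cells: iterable of (cluster, spliced_count, unspliced_count) per cell."""
    by_cluster = defaultdict(list)
    for cluster, spliced, unspliced in cells:
        by_cluster[cluster].append(unspliced_fraction(spliced, unspliced))
    summary = {cluster: mean(values) for cluster, values in by_cluster.items()}
    summary["overall"] = mean(f for values in by_cluster.values() for f in values)
    return summary


# Toy data (made up): two cells in each of two clusters.
cells = [("0", 90, 10), ("0", 80, 20), ("1", 60, 40), ("1", 50, 50)]

summary = mean_fraction_by_cluster(cells)
for cluster, fraction in summary.items():
    print(f"{cluster}: {100 * fraction:.1f}% unspliced, {100 * (1 - fraction):.1f}% spliced")
```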





RNA-velocity UMAP - Cre3


The RNA velocities, projected onto individual cells or drawn as streamlines, are displayed below.




Unspliced fraction UMAP - Cre3


The unspliced fraction (unspliced counts divided by the sum of spliced and unspliced counts) was projected onto the UMAP below.




Pseudotime - Cre3


The velocity pseudotime projected onto the UMAP is shown below. To learn more about its meaning, please follow the link.






Identify trajectory genes by cluster - Cre3


The most important genes for the RNA-velocity analysis were identified per cluster and exported to the table results/velocyto/tables/cre3_velocity_genes_by_cluster.tsv.

In addition, 5 genes per cluster are shown below.






Speed & coherence - Cre3


The velocity length and coherence are presented below. Please consult the official documentation for their correct interpretation.
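To give an intuition for these two quantities, the sketch below treats speed as the length of a cell's velocity vector and coherence as the average Pearson correlation between that vector and the velocity vectors of the cell's neighbours. These definitions are stated as working assumptions and should be checked against the scvelo documentation; the function names, the neighbour lists and the velocity values are hypothetical.

```python
import math


def vector_length(vector):
    """Euclidean length of a velocity vector (used here as 'speed')."""
    return math.sqrt(sum(x * x for x in vector))


def pearson(a, b):
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    sd_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    sd_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    if sd_a == 0 or sd_b == 0:
        return 0.0
    return cov / (sd_a * sd_b)


def coherence(velocities, neighbours, cell):
    """Mean correlation between a cell's velocity and its neighbours' velocities."""
    scores = [pearson(velocities[cell], velocities[other]) for other in neighbours[cell]]
    return sum(scores) / len(scores)


# Toy velocities over three genes (made up): 'a' and 'b' agree, 'c' opposes both.
velocities = {"a": [1.0, 2.0, 3.0], "b": [2.0, 4.0, 6.0], "c": [3.0, 2.0, 1.0]}
neighbours = {"a": ["b"], "b": ["a"], "c": ["a", "b"]}

for cell in velocities:
    speed = vector_length(velocities[cell])
    print(f"cell {cell}: speed={speed:.2f} coherence={coherence(velocities, neighbours, cell):+.2f}")
```

Under these assumptions, a high speed with low coherence would point to a cell whose estimated direction disagrees with its neighbourhood, so it should be interpreted with caution.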







Estimating RNA-velocity & projecting trajectories through UMAP - Lox2


Spliced/unspliced fractions - Lox2


The overall percentages of spliced and unspliced counts, as well as the percentages broken down by cluster, are presented below.





RNA-velocity UMAP - Lox2


The RNA velocities, projected onto individual cells or drawn as streamlines, are displayed below.






Unspliced fraction UMAP - Lox2


The unspliced fraction (unspliced counts divided by the sum of spliced and unspliced counts) was projected onto the UMAP below.






Pseudotime - Lox2


The velocity pseudotime projected onto the UMAP is shown below. To learn more about its meaning, please follow the link.






Identify trajectory genes by cluster - Lox2


The most important genes for the RNA-velocity analysis were identified per cluster and exported to the table results/velocyto/tables/lox2_velocity_genes_by_cluster.tsv.

In addition, 5 genes per cluster are shown below.






Speed & coherence - Lox2


The velocity length and coherence are presented below. Please consult the official documentation for their correct interpretation.